W5. Pumping Lemma for Regular Languages, Pushdown Automata

Author

Manuel Mazzara

Published

February 19, 2026

1. Summary

1.1 Why Some Languages Are Not Regular

Recall that a regular language is any language that can be recognized by a finite state automaton (FSA). FSAs are powerful tools, but they have a fundamental limitation: they have only a fixed, finite amount of memory—precisely the current state. An FSA with \(n\) states can “remember” only which of those \(n\) states it is in, nothing more.

This limitation means FSAs cannot handle patterns that require counting to an arbitrary depth. For example:

  • \(L = \{a^n b^n \mid n \geq 1\}\): To recognize this, the machine must count \(n\) a’s and then verify exactly \(n\) b’s. No matter how many states the FSA has, a sufficiently large \(n\) will overflow its memory.
  • \(L = \{ww^R \mid w \in \{a,b\}^*\}\): Even-length palindromes. To recognize these, the machine must remember the entire first half of the string, which can be arbitrarily long.

To prove that a language is regular, we just exhibit a working FSA. To prove that a language is not regular, however, we face a harder task: we must show that no possible FSA can recognize it. Checking every finite automaton is clearly impossible. Instead, we use a powerful theoretical tool: the Pumping Lemma.

1.2 The Pigeonhole Principle

Before stating the Pumping Lemma, we need the classic Pigeonhole Principle:

If \(n\) pigeons are placed into \(m\) pigeonholes and \(n > m\), then at least one pigeonhole must contain at least two pigeons.

pigeonhole p1 State q0 p2 State q1 p3 State q2 t1 visit 1 t1->p1 t2 visit 2 t2->p2 t3 visit 3 t3->p3 t4 visit 4 t4->p2 repeated state

Pigeonhole principle applied to automata: more visits than states force a repetition

Application to FSAs: Think of the states of an FSA as pigeonholes and the transitions traversed while reading a string as pigeons. If a string has length greater than the number of states \(|Q|\), then while processing the string, the automaton visits more than \(|Q|\) states (counting repetitions). Since there are only \(|Q|\) distinct states, at least one state must be visited twice. That repeated state corresponds to a cycle in the automaton.

1.3 Infinite Regular Languages Must Have Cycles

Consider an infinite regular language \(L\) recognized by an FSA with \(m\) states. Since \(L\) is infinite, there must be strings in \(L\) of arbitrarily large length. When the FSA processes a sufficiently long string (one with more than \(m\) symbols), it visits more than \(m\) transitions. By the Pigeonhole Principle, some state \(q\) must be revisited.

pumping_loop start q0 q0 start->q0 q q q0->q x q->q y qf qf q->qf z

A long accepting path must contain a loop that can be pumped

This means the automaton’s path looks like: \[\text{start} \xrightarrow{\text{read } x} q \xrightarrow{\text{read } y} q \xrightarrow{\text{read } z} \text{accept}\] where \(y\) is the string that “loops” back to \(q\). The crucial observation: since the machine is deterministic and \(y\) returns to \(q\), you can go around the loop any number of times: \[x y^0 z, \quad x y^1 z, \quad x y^2 z, \quad x y^3 z, \quad \ldots\] all lead to the same accepting state. Therefore, all these strings belong to \(L\).

1.4 The Pumping Lemma for Regular Languages
1.4.1 Formal Statement

Pumping Lemma: If \(L \subseteq \Sigma^*\) is a regular language, then there exists an integer \(m \geq 1\) (called the pumping length or critical length) such that for every string \(w \in L\) with \(|w| \geq m\), there exists a decomposition \(w = xyz\) satisfying:

  1. \(|y| \geq 1\) (the “pump” \(y\) is non-empty)
  2. \(|xy| \leq m\) (the pump occurs early in the string)
  3. \(xy^i z \in L\) for all \(i \geq 0\) (pumping any number of times stays in \(L\))

The pumping length \(m\) can be taken to be the number of states in the FSA recognizing \(L\).

Proof sketch: Let \(M = (Q, \Sigma, \delta, q_0, F)\) be an FSA with \(|Q| = m\) states recognizing \(L\). Take any \(w \in L\) with \(|w| \geq m\). While reading \(w\), the machine passes through at least \(m+1\) states (including the start). By the Pigeonhole Principle, some state \(q\) is repeated. Let the path be:

\[q_0 \xrightarrow{x} q \xrightarrow{y} q \xrightarrow{z} q_f \in F\]

Since the first repetition must occur within the first \(m+1\) states visited (i.e., after reading at most \(m\) symbols), we have \(|xy| \leq m\). Since \(y\) is the string taking us from \(q\) back to \(q\), we have \(|y| \geq 1\). And since \(\delta^*(q, y) = q\), pumping \(y\) any number of times still returns to \(q\), after which \(z\) takes us to the accepting state. \(\square\)

1.4.2 Necessary but Not Sufficient

The Pumping Lemma gives only a necessary condition for regularity, not a sufficient one. This means:

  • If \(L\) is regular \(\Rightarrow\) \(L\) satisfies the Pumping Lemma (guaranteed)
  • If \(L\) satisfies the Pumping Lemma \(\not\Rightarrow\) \(L\) is regular (some non-regular languages also satisfy it!)

Therefore:

  • Cannot use it to prove a language is regular
  • Can use it to prove a language is not regular (via the contrapositive)
1.4.3 The Contrapositive: Proving Non-Regularity

The contrapositive of the Pumping Lemma is:

If for every integer \(m \geq 1\) there exists a string \(w \in L\) with \(|w| \geq m\) such that for every decomposition \(w = xyz\) with \(|y| \geq 1\) and \(|xy| \leq m\), there exists some \(i \geq 0\) with \(xy^i z \notin L\), then \(L\) is not regular.

This is the tool we actually use. Think of it as a two-player game:

Player 1 (Adversary) Player 2 (You)
Picks any \(m \geq 1\) You choose \(w \in L\) with \(\|w\| \geq m\)
Picks any split \(w = xyz\) with \(\|y\| \geq 1\) and \(\|xy\| \leq m\) You find \(i \geq 0\) such that \(xy^iz \notin L\)

You win (language is not regular) if you can always respond successfully, regardless of the adversary’s choices.

1.4.4 Standard Proof Template

Here is the standard structure for a Pumping Lemma proof:

  1. Assume for contradiction that \(L\) is regular.
  2. Let \(m \geq 1\) be the pumping length given by the Pumping Lemma.
  3. Choose a specific word \(w \in L\) with \(|w| \geq m\) (pick it cleverly so that any valid split must pump outside \(L\)).
  4. Consider an arbitrary valid split \(w = xyz\) with \(|y| \geq 1\) and \(|xy| \leq m\).
  5. Use the constraint \(|xy| \leq m\) to restrict where \(y\) can appear.
  6. Find a specific \(i \geq 0\) such that \(xy^i z \notin L\).
  7. Conclude that the Pumping Lemma is violated, so \(L\) is not regular.

Choosing the right word is the most creative step. A good choice forces \(y\) to consist of a single type of symbol, making it easy to show that pumping breaks the language’s structure.

1.4.5 Classic Example: \(L = \{a^n b^n \mid n > 0\}\)

Claim: \(L = \{a^n b^n \mid n > 0\}\) is not regular.

Proof:

  1. Suppose \(L\) is regular. Let \(m\) be the pumping length.
  2. Choose \(w = a^m b^m \in L\). Note \(|w| = 2m \geq m\).
  3. Consider any split \(w = xyz\) with \(|xy| \leq m\) and \(|y| \geq 1\).
  4. Since \(|xy| \leq m\) and the first \(m\) characters of \(w\) are all \(a\)’s, the substring \(xy\) lies entirely within the \(a\)-block. Therefore \(x = a^p\) and \(y = a^q\) for some \(p \geq 0\), \(q \geq 1\) with \(p + q \leq m\).
  5. So \(z = a^{m-p-q} b^m\).
  6. Consider \(i = 2\): \(xy^2 z = a^p \cdot a^{2q} \cdot a^{m-p-q} b^m = a^{m+q} b^m\).
  7. Since \(q \geq 1\), the pumped string has \(m + q > m\) a’s and only \(m\) b’s. Therefore \(xy^2 z \notin L\).

This contradicts the Pumping Lemma. Therefore \(L\) is not regular. \(\square\)

Three cases of the lecture analysis: The lecture also directly analyzed all possible splits without relying on \(|xy| \leq m\), to show every possible split breaks:

  • Case 1: \(y = a^k\) (only a’s) — pumping gives \(a^{m+k} b^m \notin L\) (more a’s than b’s)
  • Case 2: \(y = b^k\) (only b’s) — pumping gives \(a^m b^{m+k} \notin L\) (more b’s than a’s)
  • Case 3: \(y = a^k b^s\) (mixed) — pumping gives \(a^{m-k}(a^k b^s)^2 b^{m-s} \notin L\) (b’s appear in the middle)

In all cases, some pumped string is not of the form \(a^n b^n\).

1.5 Pushdown Automata
1.5.1 Motivation: Going Beyond Regular Languages

The Pumping Lemma shows that FSAs are insufficient for languages like \(\{a^n b^n\}\) because they lack memory for counting. To recognize such languages, we need a model with more memory. The natural extension is to add a stack to the FSA, yielding a Pushdown Automaton (PDA).

The hierarchy of computational models (from weakest to strongest) looks like:

  • Combinational logic (no memory, just Boolean circuits)
  • Finite State Automata (fixed, finite memory: only the current state)
  • Pushdown Automata (unbounded stack memory: can recognize context-free languages)
  • Turing Machines (unlimited tape memory: can compute anything computable)

PDAs sit precisely in the layer above FSAs and are capable of recognizing context-free languages (CFLs), which include most programming language constructs (nested brackets, balanced parentheses, etc.).

1.5.2 What is a Stack?

A stack is a last-in, first-out (LIFO) data structure. Think of a stack of plates: you can only add or remove from the top.

stack_ops top top: c mid b top->mid pop pop top->pop low a mid->low z0 Z₀ low->z0 push push push->top

Stack behavior: push adds to the top, pop removes the most recent symbol

Operations:

  • Push: Insert a new symbol on top of the stack.
  • Pop: Remove the top symbol from the stack (also “reading” it).

The stack always has a special bottom-of-stack symbol \(Z_0\) that marks the bottom. You can check whether the stack is “empty” (contains only \(Z_0\)).

Example — push \(a\), then \(b\), then \(c\):

Starting from an empty stack (containing only \(Z_0\)):

  1. After pushing \(a\): stack (top to bottom) \(= a, Z_0\)
  2. After pushing \(b\): stack \(= b, a, Z_0\)
  3. After pushing \(c\): stack \(= c, b, a, Z_0\)
  4. After popping: \(c\) is removed; stack \(= b, a, Z_0\)

The last symbol pushed (\(c\)) is the first to be popped. This is the LIFO property.

Historical note: The concept of a stack was introduced by Alan Turing in his 1946 paper on the Automatic Computing Engine (ACE). He called the operations BURY and UNBURY, and used them in the theory of subroutines.

1.5.3 Informal Description

An FSA has a read head on an input tape and a finite control unit that remembers the current state.

A Pushdown Automaton is identical, but adds a stack as external memory:

  • The finite control reads the next input symbol and the top of the stack.
  • Based on these (and the current state), it transitions to a new state and replaces the top of the stack with a string of symbols.

pda_arch input Input Tape control q input->control read a or ε next q' control->next transition stack Stack A A Z₀ stack->control top symbol next->stack replace top

PDA architecture: finite control reads input and the top of the stack, then updates both

The PDA can:

  • Push symbols onto the stack (by replacing the top with a longer string)
  • Pop the top symbol (by replacing it with \(\varepsilon\))
  • Leave the stack unchanged (by replacing the top with itself)
  • Make \(\varepsilon\)-moves (read nothing from input, only modify the stack)
1.5.4 Formal Definition

A Pushdown Automaton is a 7-tuple:

\[P = (Q,\, \Sigma,\, \Gamma,\, \delta,\, q_0,\, Z_0,\, F)\]

where:

  • \(Q\): finite set of states
  • \(\Sigma\): finite input alphabet
  • \(\Gamma\): finite stack alphabet (symbols that can appear on the stack)
  • \(\delta: Q \times (\Sigma \cup \{\varepsilon\}) \times \Gamma \rightarrow \mathcal{P}(Q \times \Gamma^*)\): the transition function
  • \(q_0 \in Q\): the initial state
  • \(Z_0 \in \Gamma\): the initial stack symbol (bottom-of-stack marker)
  • \(F \subseteq Q\): the set of accepting (final) states

Reading the transition function: A transition \(\delta(q, a, A) \ni (q', \alpha)\) means: > In state \(q\), reading input symbol \(a\) (or \(\varepsilon\)), with \(A\) on top of the stack: move to state \(q'\) and replace \(A\) with the string \(\alpha\).

Transitions are written on arrows as: \(a,\, A / \alpha\) meaning “read \(a\) from input, pop \(A\) from stack, push \(\alpha\).”

  • To push \(B\) below \(A\): write \(A / BA\) (replace \(A\) with \(BA\); \(B\) is now on top)
  • To pop \(A\): write \(A / \varepsilon\) (replace \(A\) with nothing)
  • To leave stack unchanged: write \(A / A\)
1.5.5 Moves of a PDA

Each step of a PDA depends on three things:

  1. The current state \(q\)
  2. The current input symbol (or \(\varepsilon\) for a silent move)
  3. The top of the stack

In one move, the PDA simultaneously:

  • Changes state to some \(q' \in Q\)
  • Advances the input head (or stays put for \(\varepsilon\)-moves)
  • Replaces the top-of-stack symbol \(A\) with a string \(\alpha \in \Gamma^*\)

Since \(\delta\) maps to a set of possible next configurations, PDAs are inherently nondeterministic in their general form. (A PDA is deterministic if \(|\delta(q, a, A)| \leq 1\) for all \(q, a, A\) and if \(\delta(q, a, A) \neq \emptyset\) implies \(\delta(q, \varepsilon, A) = \emptyset\).)

1.5.6 Acceptance Conditions

A PDA can accept input in two different ways:

  • Acceptance by final state: The input is fully consumed and the PDA is in an accepting state \(q \in F\). The stack content does not matter.
  • Acceptance by empty stack: The input is fully consumed and the stack is empty (contains only \(Z_0\) or nothing). The state does not matter.

For nondeterministic PDAs, these two acceptance modes are equivalent: any language accepted by one mode can be accepted by the other with an appropriate PDA. For deterministic PDAs, however, the two modes are not equivalent.

Note: Acceptance by empty stack has a subtle restriction: the language accepted by an empty-stack PDA must be prefix-free (no string in the language is a proper prefix of another string in the language). This is because once the stack empties on reading \(w\), processing any extension \(wz\) becomes impossible.

1.5.7 Step-by-Step Example: PDA for \(\{a^n b^n \mid n \geq 1\}\)

The key idea: push one \(A\) for each \(a\), then pop one \(A\) for each \(b\). If the stack is exactly empty when all \(b\)’s are consumed, the counts matched.

PDA construction:

States: \(p_0\) (start), \(p_1\) (reading \(a\)’s), \(p_2\) (reading \(b\)’s), \(p_3\) (accepting)

Stack alphabet: \(\Gamma = \{Z_0, A\}\), with \(Z_0\) as the bottom marker.

Transitions:

  • \(p_0 \to p_1\): \(a,\, Z_0 / AZ_0\) — read first \(a\); push \(A\) on top of \(Z_0\)
  • \(p_1 \circlearrowleft\): \(a,\, A / AA\) — read subsequent \(a\)’s; push another \(A\)
  • \(p_1 \to p_2\): \(b,\, A / \varepsilon\) — first \(b\); pop one \(A\)
  • \(p_2 \circlearrowleft\): \(b,\, A / \varepsilon\) — subsequent \(b\)’s; pop one \(A\) each time
  • \(p_2 \to p_3\): \(\varepsilon,\, Z_0 / Z_0\) — end of input; if \(Z_0\) is on top (all \(A\)’s were popped), accept

Trace for input “aabb”:

Step State Remaining Input Stack (top first) Transition
1 \(p_0\) \(\mathbf{a}abb\) \(Z_0\) \(p_0 \to p_1\): \(a, Z_0/AZ_0\)
2 \(p_1\) \(\mathbf{a}bb\) \(A, Z_0\) \(p_1 \circlearrowleft\): \(a, A/AA\)
3 \(p_1\) \(\mathbf{b}b\) \(A, A, Z_0\) \(p_1 \to p_2\): \(b, A/\varepsilon\)
4 \(p_2\) \(\mathbf{b}\) \(A, Z_0\) \(p_2 \circlearrowleft\): \(b, A/\varepsilon\)
5 \(p_2\) \(\varepsilon\) \(Z_0\) \(p_2 \to p_3\): \(\varepsilon, Z_0/Z_0\)
6 \(p_3\) \(\varepsilon\) \(Z_0\) ACCEPT

2. Definitions

  • Regular Language: A language \(L \subseteq \Sigma^*\) that can be recognized by some finite state automaton. Equivalently, a language for which there exists an FSA \(M\) with \(L = L(M)\).
  • Pumping Lemma: A theorem stating that every regular language \(L\) has a pumping length \(m \geq 1\) such that any word \(w \in L\) with \(|w| \geq m\) can be split as \(w = xyz\) with \(|y| \geq 1\), \(|xy| \leq m\), and \(xy^i z \in L\) for all \(i \geq 0\).
  • Pumping Length (\(m\)): The critical length in the Pumping Lemma; can be taken as the number of states in the FSA recognizing the language. Any string in the language of length \(\geq m\) must be “pumpable.”
  • Pumping (of a string): Repeating the middle portion \(y\) of a decomposition \(w = xyz\) any number of times. Pumping \(i\) times gives \(xy^iz\).
  • Contrapositive of the Pumping Lemma: If for every \(m \geq 1\) there exists \(w \in L\) with \(|w| \geq m\) such that every valid split \(w = xyz\) (with \(|y| \geq 1\), \(|xy| \leq m\)) has some \(i \geq 0\) with \(xy^iz \notin L\), then \(L\) is not regular.
  • Pigeonhole Principle: If \(n\) objects are placed into \(k < n\) containers, at least one container must hold more than one object.
  • Cycle (in an FSA): A path in the state diagram that starts and ends at the same state. In infinite regular languages, cycles are necessary for generating infinitely many strings.
  • Pushdown Automaton (PDA): A computational model consisting of a finite control unit, an input tape, and a stack. Formally a 7-tuple \((Q, \Sigma, \Gamma, \delta, q_0, Z_0, F)\). PDAs recognize exactly the class of context-free languages.
  • Stack: A last-in, first-out (LIFO) data structure used by a PDA as its memory. Supports two operations: push (add to top) and pop (remove from top).
  • Stack Alphabet (\(\Gamma\)): The finite set of symbols that may appear on the PDA’s stack. Typically includes a special bottom-of-stack marker \(Z_0\).
  • Bottom-of-Stack Symbol (\(Z_0\)): A special symbol at the bottom of the PDA’s stack, used to detect when the stack is empty.
  • PDA Transition Function (\(\delta\)): A function \(\delta: Q \times (\Sigma \cup \{\varepsilon\}) \times \Gamma \to \mathcal{P}(Q \times \Gamma^*)\) specifying the PDA’s moves. A transition \(\delta(q, a, A) \ni (q', \alpha)\) means: in state \(q\), reading \(a\) from input and \(A\) from the stack top, move to state \(q'\) and replace \(A\) with string \(\alpha\).
  • \(\varepsilon\)-move: A transition in a PDA that reads no input symbol. The PDA changes state and modifies the stack without consuming any input character.
  • Acceptance by Final State: A PDA accepts a string if, after consuming all input, it is in a state \(q \in F\).
  • Acceptance by Empty Stack: A PDA accepts a string if, after consuming all input, the stack is empty (contains only \(Z_0\)). Equivalent to acceptance by final state for nondeterministic PDAs.
  • Context-Free Language (CFL): A language recognized by a pushdown automaton, or equivalently, generated by a context-free grammar. Every regular language is context-free, but not vice versa.
  • Nondeterministic PDA: A PDA where \(\delta(q, a, A)\) may contain multiple possible transitions (a set of size \(> 1\)). The PDA accepts if some sequence of choices leads to acceptance.
  • Deterministic PDA (DPDA): A PDA with at most one possible transition at each step: \(|\delta(q, a, A)| \leq 1\) for all \(q, a, A\), and \(\delta(q, a, A) \neq \emptyset\) implies \(\delta(q, \varepsilon, A) = \emptyset\).

3. Formulas

  • Pumping Lemma (Regular Languages): For regular \(L\), \(\exists m \geq 1\) such that \(\forall w \in L\) with \(|w| \geq m\), \(\exists\) split \(w = xyz\) with \(|y| \geq 1\), \(|xy| \leq m\), and \(\forall i \geq 0: xy^iz \in L\)
  • Contrapositive (for proofs): \(L\) is not regular if \(\forall m \geq 1\), \(\exists w \in L\) with \(|w| \geq m\) such that \(\forall\) splits \(w = xyz\) with \(|y| \geq 1\) and \(|xy| \leq m\), \(\exists i \geq 0: xy^iz \notin L\)
  • Length after pumping: \(|xy^iz| = |w| + (i-1)|y|\)
  • Constraint on split location: \(|xy| \leq m\) forces \(y\) to lie within the first \(m\) symbols of \(w\)
  • PDA transition notation: \(a,\, A / \alpha\) — read \(a\) from input, pop \(A\) from stack, push \(\alpha\); write \(\varepsilon, A / \alpha\) for a silent (\(\varepsilon\)-)move
  • PDA formal definition: \(P = (Q, \Sigma, \Gamma, \delta, q_0, Z_0, F)\) where \(\delta: Q \times (\Sigma \cup \{\varepsilon\}) \times \Gamma \to \mathcal{P}(Q \times \Gamma^*)\)
  • Acceptance by final state: \(w\) is accepted iff \(\delta^*(q_0, w, Z_0) \ni (q_f, \gamma)\) for some \(q_f \in F\) and \(\gamma \in \Gamma^*\)

4. Examples

4.1. Prove \(L_1 = \{vv^R \mid v \in \{a,b\}^*\}\) is Not Regular (Lab 5, Task 1)

Use the Pumping Lemma to prove that \(L_1 = \{vv^R \mid v \in \Sigma_1^*\}\) over \(\Sigma_1 = \{a, b\}\) is not a regular language.

(Note: \(v^R\) denotes the reverse of string \(v\). The language \(L_1\) consists of all even-length palindromes over \(\{a, b\}\).)

Click to see the solution

Key Concept: Note that \(vv^R\) is always a palindrome (since \((vv^R)^R = (v^R)^R v^R = vv^R\)). Choose the word \(a^m b^{2m} a^m\): pumping the left \(a\)-block yields a string with unequal \(a\)-blocks, which cannot be a palindrome.

  1. Assume for contradiction that \(L_1\) is regular. Let \(m \geq 1\) be the pumping length.
  2. Choose the word \(w = a^m b^{2m} a^m \in L_1\) (take \(v = a^m b^m\), so \(v^R = b^m a^m\) and \(vv^R = a^m b^m b^m a^m = a^m b^{2m} a^m\)). Note \(|w| = 4m \geq m\).
  3. Consider an arbitrary split \(w = xyz\) with \(|xy| \leq m\) and \(|y| \geq 1\).
  4. Restrict \(y\): The first \(m\) characters of \(w\) are all \(a\)’s, and \(|xy| \leq m\), so \(xy\) lies in the \(a\)-prefix. Thus \(x = a^p\), \(y = a^k\) with \(k \geq 1\), \(p + k \leq m\), and \(z = a^{m-p-k} b^{2m} a^m\).
  5. Pump with \(i = 0\) (remove one copy of \(y\)): \[xy^0 z = xz = a^p \cdot a^{m-p-k} b^{2m} a^m = a^{m-k} b^{2m} a^m\]
  6. Verify failure: The string \(a^{m-k} b^{2m} a^m\) has \(m-k\) leading \(a\)’s and \(m\) trailing \(a\)’s. For it to be in \(L_1 = \{vv^R\}\) (an even-length palindrome), the \((m-k+1)\)-th character from the left must equal the \((m-k+1)\)-th character from the right.
    • The \((m-k+1)\)-th character from the left: the first \(m-k\) characters are \(a\)’s, so position \(m-k+1\) is the first \(b\). Character = \(b\).
    • The \((m-k+1)\)-th character from the right: the last \(m\) characters are \(a\)’s. Since \(m > m-k\) (as \(k \geq 1\)), position \(m-k+1\) from the right is still in the trailing \(a\)-block. Character = \(a\).
    • Since \(b \neq a\), the string is not a palindrome, so \(xy^0 z \notin L_1\).
  7. Conclusion: The Pumping Lemma is violated. Therefore \(L_1\) is not regular. \(\square\)

Answer: \(L_1 = \{vv^R \mid v \in \{a,b\}^*\}\) is not regular.

4.2. Prove \(L_2 = \{v \mid v \text{ has equal numbers of } a\text{'s and } b\text{'s}\}\) is Not Regular (Lab 5, Task 2)

Use the Pumping Lemma to prove that \(L_2 = \{v \in \{a,b\}^* \mid \#_a(v) = \#_b(v)\}\) is not a regular language over \(\Sigma_2 = \{a, b\}\).

Click to see the solution

Key Concept: Any string in \(L_2\) has equal counts of \(a\)’s and \(b\)’s. Choosing \(a^m b^m\) forces \(y\) into the \(a\)-block; pumping down removes \(a\)’s without touching \(b\)’s, breaking the balance.

  1. Assume for contradiction that \(L_2\) is regular. Let \(m \geq 1\) be the pumping length.
  2. Choose the word \(w = a^m b^m \in L_2\) (\(m\) \(a\)’s and \(m\) \(b\)’s). Note \(|w| = 2m \geq m\).
  3. Consider an arbitrary split \(w = xyz\) with \(|xy| \leq m\) and \(|y| \geq 1\).
  4. Restrict \(y\): Since the first \(m\) characters are \(a\)’s and \(|xy| \leq m\), \(y\) lies in the \(a\)-block. So \(y = a^k\) with \(k \geq 1\) and \(z = a^{m-p-k} b^m\).
  5. Pump with \(i = 0\): \[xy^0 z = xz = a^{m-k} b^m\]
  6. Verify failure: \(\#_a(xy^0 z) = m - k\) and \(\#_b(xy^0 z) = m\). Since \(k \geq 1\), \(m - k < m\), so the counts are unequal. Therefore \(xy^0 z \notin L_2\).
  7. Conclusion: The Pumping Lemma is violated. Therefore \(L_2\) is not regular. \(\square\)

Answer: \(L_2 = \{v \mid \#_a(v) = \#_b(v)\}\) is not regular.

4.3. Prove \(L_3 = \{a^{n!} \mid n \geq 0\}\) is Not Regular (Lab 5, Task 3)

Use the Pumping Lemma to prove that \(L_3 = \{a^{n!} \mid n \geq 0\}\) over \(\Sigma_3 = \{a\}\) is not a regular language.

(Note: \(n! = n \times (n-1) \times \cdots \times 1\) is the factorial of \(n\), with \(0! = 1\). So \(L_3 = \{a, aa, a^6, a^{24}, a^{120}, \ldots\}\).)

Click to see the solution

Key Concept: Factorials grow faster than any linear function. Consecutive factorials satisfy \((m+1)! - m! = m \cdot m!\), which is far larger than \(m\) for large \(m\). Any pump of size \(\leq m\) cannot bridge the gap from \(m!\) to the next factorial \((m+1)!\), so some pumped word falls strictly between two consecutive factorials and is not in \(L_3\).

  1. Assume for contradiction that \(L_3\) is regular. Let \(m \geq 1\) be the pumping length.
  2. Choose the word \(w = a^{m!} \in L_3\). Note \(|w| = m! \geq m\) for \(m \geq 1\).
  3. Consider an arbitrary split \(w = xyz\) with \(|xy| \leq m\) and \(|y| \geq 1\).
  4. Determine \(|y|\): Let \(|y| = k\). Since \(1 \leq |y| \leq |xy| \leq m\), we have \(1 \leq k \leq m\).
  5. Pump with \(i = m + 1\): \[|xy^{m+1} z| = |w| + m \cdot k = m! + mk\]
  6. Verify failure:
    • Lower bound: \(m! + mk \geq m! + m \cdot 1 = m! + m > m!\) (since \(m \geq 1\)).
    • Upper bound: \(m! + mk \leq m! + m \cdot m = m! + m^2\).
    • Gap to next factorial: \((m+1)! = (m+1) \cdot m! = m! + m \cdot m!\). We need \(m! + mk < (m+1)!\), i.e., \(mk < m \cdot m!\), i.e., \(k < m!\). Since \(k \leq m < m!\) for \(m \geq 3\), this holds.
    • Therefore \(m! < m! + mk < (m+1)!\), so \(m! + mk\) is strictly between two consecutive factorials and equals no factorial.
    • Hence \(xy^{m+1} z = a^{m! + mk} \notin L_3\).
    (For \(m = 1\): the only split has \(k = 1\). We use \(i = 0\): \(|xy^0 z| = 1 - 1 = 0\). Since \(0 \neq n!\) for any \(n \geq 0\) (as \(n! \geq 1\) always), \(xy^0 z = \varepsilon \notin L_3\). \(\checkmark\))
  7. Conclusion: In all cases, some pumped word is not in \(L_3\). The Pumping Lemma is violated. Therefore \(L_3\) is not regular. \(\square\)

Answer: \(L_3 = \{a^{n!} \mid n \geq 0\}\) is not regular.

4.4. Prove \(L_4 = \{a^n b^l c^{n+l} \mid n, l \geq 0\}\) is Not Regular (Lab 5, Task 4)

Use the Pumping Lemma to prove that \(L_4 = \{a^n b^l c^{n+l} \mid n, l \geq 0\}\) over \(\Sigma_4 = \{a, b, c\}\) is not a regular language.

Click to see the solution

Key Concept: In \(L_4\), the number of \(c\)’s equals the total count of \(a\)’s and \(b\)’s combined. Choose \(a^m b^m c^{2m}\); the constraint forces \(y\) into the \(a\)-block. Pumping down removes some \(a\)’s, breaking the sum constraint.

  1. Assume for contradiction that \(L_4\) is regular. Let \(m \geq 1\) be the pumping length.
  2. Choose the word \(w = a^m b^m c^{2m} \in L_4\) (take \(n = l = m\), so \(n + l = 2m\)). Note \(|w| = 4m \geq m\).
  3. Consider an arbitrary split \(w = xyz\) with \(|xy| \leq m\) and \(|y| \geq 1\).
  4. Restrict \(y\): The first \(m\) characters of \(w\) are all \(a\)’s, so \(|xy| \leq m\) forces \(y\) to lie entirely in the \(a\)-block. Thus \(y = a^k\) with \(k \geq 1\), and \(z = a^{m-p-k} b^m c^{2m}\).
  5. Pump with \(i = 0\): \[xy^0 z = xz = a^{m-k} b^m c^{2m}\]
  6. Verify failure: For \(xy^0 z \in L_4\), we’d need \(n' = m - k\) a’s, \(l' = m\) b’s, and the number of \(c\)’s to equal \(n' + l' = (m-k) + m = 2m - k\). But the actual number of \(c\)’s is \(2m \neq 2m - k\) (since \(k \geq 1\)). Therefore \(xy^0 z \notin L_4\).
  7. Conclusion: The Pumping Lemma is violated. Therefore \(L_4\) is not regular. \(\square\)

Answer: \(L_4 = \{a^n b^l c^{n+l} \mid n, l \geq 0\}\) is not regular.

4.5. Prove \(L = \{a^n b^n \mid n > 0\}\) is Not Regular (Tutorial 5, Example 1)

Use the Pumping Lemma to prove that \(L = \{a^n b^n \mid n > 0\}\) is not a regular language.

Click to see the solution

Key Concept: Choose the word \(a^m b^m\). The constraint \(|xy| \leq m\) forces \(y\) to lie entirely in the \(a\)-block. Any pumping then unbalances the \(a\)-to-\(b\) ratio.

  1. Assume for contradiction that \(L\) is regular. Let \(m \geq 1\) be the pumping length.
  2. Choose the word \(w = a^m b^m \in L\). Note \(|w| = 2m \geq m\), so the Pumping Lemma applies.
  3. Consider an arbitrary split \(w = xyz\) with \(|xy| \leq m\) and \(|y| \geq 1\).
  4. Restrict the position of \(y\): Since the first \(m\) characters of \(w = a^m b^m\) are all \(a\)’s, and \(|xy| \leq m\), the portion \(xy\) consists entirely of \(a\)’s. Therefore \(x = a^p\) and \(y = a^q\) for some \(p \geq 0\), \(q \geq 1\) with \(p + q \leq m\), and \(z = a^{m-p-q} b^m\).
  5. Pump with \(i = 2\): \[xy^2 z = a^p \cdot (a^q)^2 \cdot a^{m-p-q} b^m = a^{m+q} b^m\]
  6. Verify failure: Since \(q \geq 1\), we have \(m + q \neq m\), so \(xy^2 z = a^{m+q} b^m \notin L\) (unequal counts of \(a\)’s and \(b\)’s).
  7. Conclusion: The Pumping Lemma is violated. Therefore \(L\) is not regular. \(\square\)

Additional analysis — all three split cases: The lecture also showed this by exhausting all possible placements of \(y\) in \(w = a^m b^m\) (without assuming \(|xy| \leq m\)):

  • (a) \(y = a^k\): pumping gives \(a^{m+k}b^m \notin L\).
  • (b) \(y = b^k\): pumping gives \(a^m b^{m+k} \notin L\).
  • (c) \(y = a^k b^s\) (\(k, s \geq 1\)): pumping gives \(a^{m-k}(a^k b^s)^2 b^{m-s}\). The middle contains \(b\)’s followed by \(a\)’s, so this string is not of the form \(a^n b^n\). Thus not in \(L\).

In all three cases, pumping fails. \(\square\)

Answer: \(L = \{a^n b^n \mid n > 0\}\) is not regular.

4.6. Prove \(L_2 = \{a^n b\, a^n \mid n \in \mathbb{N}\}\) is Not Regular (Tutorial 5, Example 2)

Use the Pumping Lemma to prove that \(L_2 = \{a^n b\, a^n \mid n \in \mathbb{N}\}\) is not a regular language over \(\Sigma = \{a, b\}\).

Click to see the solution

Key Concept: Choose \(a^m b\, a^m\). The \(b\) in the center is a structural landmark: any valid string in \(L_2\) has exactly one \(b\) in the middle with equal \(a\)-blocks on either side. Pumping the left \(a\)-block destroys this symmetry.

  1. Assume for contradiction that \(L_2\) is regular. Let \(m \geq 1\) be the pumping length.

  2. Choose the word \(w = a^m b\, a^m \in L_2\). Note \(|w| = 2m + 1 \geq m\).

  3. Consider an arbitrary split \(w = xyz\) with \(|xy| \leq m\) and \(|y| \geq 1\).

  4. Restrict \(y\): Since \(|xy| \leq m\) and the first \(m\) characters are all \(a\)’s, \(y\) lies entirely in the leading \(a\)-block. So \(x = a^{m-q_1}\) (for some decomposition), \(y = a^q\) with \(q \geq 1\), \(z = b\, a^m\).

    More precisely: \(x = a^p\), \(y = a^q\), \(z = a^{m-p-q} b\, a^m\) with \(q \geq 1\), \(p + q \leq m\).

  5. Pump with \(i = 2\): \[xy^2 z = a^p \cdot a^{2q} \cdot a^{m-p-q} b\, a^m = a^{m+q} b\, a^m\]

  6. Verify failure: \(xy^2 z = a^{m+q} b\, a^m\). For this to be in \(L_2\), we’d need \(m + q = m\), but \(q \geq 1\) means \(m + q > m\). The left and right \(a\)-blocks have different lengths, so \(xy^2 z \notin L_2\).

  7. Conclusion: Pumping Lemma is violated. \(L_2\) is not regular. \(\square\)

Answer: \(L_2 = \{a^n b\, a^n \mid n \in \mathbb{N}\}\) is not regular.

4.7. Construct a PDA for \(L_1 = \{a^n b^n \mid n \geq 1\}\) (Tutorial 5, Example 3)

Show that \(L_1 = \{a^n b^n \mid n \geq 1\}\) (which is not regular) is recognized by a pushdown automaton. Describe the PDA construction.

Click to see the solution

Key Concept: Use the stack to count \(a\)’s: push one \(A\) per \(a\), then pop one \(A\) per \(b\). If the stack holds exactly \(Z_0\) after all \(b\)’s are consumed, the counts matched.

pda_abn_w5 start p0 p0 start->p0 p1 p1 p0->p1 a, Z₀/AZ₀ p1->p1 a, A/AA p2 p2 p1->p2 b, A/ε p2->p2 b, A/ε p3 p3 p2->p3 ε, Z₀/Z₀

PDA for aⁿbⁿ

  1. Design the PDA:

    • States: \(p_0\) (start), \(p_1\) (reading \(a\)’s), \(p_2\) (reading \(b\)’s), \(p_3\) (accepting)
    • Stack alphabet: \(\Gamma = \{Z_0, A\}\)
  2. Transitions:

    Transition Meaning
    \(p_0 \to p_1\): \(a,\, Z_0 / AZ_0\) First \(a\): push \(A\) onto the stack
    \(p_1 \circlearrowleft\): \(a,\, A / AA\) Each additional \(a\): push \(A\)
    \(p_1 \to p_2\): \(b,\, A / \varepsilon\) First \(b\): pop one \(A\)
    \(p_2 \circlearrowleft\): \(b,\, A / \varepsilon\) Each additional \(b\): pop one \(A\)
    \(p_2 \to p_3\): \(\varepsilon,\, Z_0 / Z_0\) End of input with \(Z_0\) on top: accept
  3. Trace for \(w = "aabb"\):

    • \(p_0 \xrightarrow{a,Z_0/AZ_0} p_1\): stack = \(A, Z_0\)
    • \(p_1 \xrightarrow{a,A/AA} p_1\): stack = \(A, A, Z_0\)
    • \(p_1 \xrightarrow{b,A/\varepsilon} p_2\): stack = \(A, Z_0\)
    • \(p_2 \xrightarrow{b,A/\varepsilon} p_2\): stack = \(Z_0\)
    • \(p_2 \xrightarrow{\varepsilon,Z_0/Z_0} p_3\): ACCEPT \(\checkmark\)
  4. Correctness:

    • On input \(a^n b^n\): exactly \(n\) \(A\)’s are pushed, then \(n\) \(A\)’s are popped. The stack returns to \(Z_0\), and the PDA accepts.
    • On input \(a^n b^k\) with \(k \neq n\): either \(A\)’s remain on the stack when \(b\)’s run out, or \(Z_0\) is encountered before all \(b\)’s are read. Neither reaches \(p_3\) on a valid \(\varepsilon\)-transition.

Answer: The PDA with states \(p_0, p_1, p_2, p_3\) (accepting) and the transitions above recognizes \(L_1 = \{a^n b^n \mid n \geq 1\}\).

4.8. Construct a PDA for \(L_2 = \{a^n b\, a^n \mid n \geq 1\}\) (Tutorial 5, Example 4)

Show that \(L_2 = \{a^n b\, a^n \mid n \geq 1\}\) is recognized by a pushdown automaton.

Click to see the solution

Key Concept: Push one \(A\) per \(a\) before the \(b\), then use the single \(b\) as a pivot, and pop one \(A\) per \(a\) after the \(b\). The stack being empty at the end confirms equal \(a\)-counts.

pda_aba start q0 q0 start->q0 q0->q0 a, Z₀/AZ₀ a, A/AA q1 q1 q0->q1 b, A/A q1->q1 a, A/ε q2 q2 q1->q2 ε, Z₀/Z₀

PDA for aⁿbaⁿ

  1. Design the PDA:

    • States: \(q_0\) (start, reading left \(a\)’s), \(q_1\) (pivot: saw \(b\), now reading right \(a\)’s), \(q_2\) (accepting)
    • Stack alphabet: \(\Gamma = \{Z_0, A\}\)
  2. Transitions:

    Transition Meaning
    \(q_0 \circlearrowleft\): \(a,\, Z_0 / AZ_0\) First \(a\) (left side): push \(A\)
    \(q_0 \circlearrowleft\): \(a,\, A / AA\) Each additional \(a\) (left side): push \(A\)
    \(q_0 \to q_1\): \(b,\, A / A\) See the \(b\): do not pop yet, transition to reading right \(a\)’s
    \(q_1 \circlearrowleft\): \(a,\, A / \varepsilon\) Each \(a\) on the right side: pop one \(A\)
    \(q_1 \to q_2\): \(\varepsilon,\, Z_0 / Z_0\) All \(A\)’s matched; \(Z_0\) on top: accept
  3. Trace for \(w = "aba"\):

    • \(q_0 \xrightarrow{a,Z_0/AZ_0} q_0\): stack = \(A, Z_0\)
    • \(q_0 \xrightarrow{b,A/A} q_1\): stack = \(A, Z_0\) (stack unchanged, state changes)
    • \(q_1 \xrightarrow{a,A/\varepsilon} q_1\): stack = \(Z_0\)
    • \(q_1 \xrightarrow{\varepsilon,Z_0/Z_0} q_2\): ACCEPT \(\checkmark\)
  4. Trace for \(w = "aabaa"\):

    • \(q_0 \xrightarrow{a,Z_0/AZ_0} q_0\): stack = \(A, Z_0\)
    • \(q_0 \xrightarrow{a,A/AA} q_0\): stack = \(A, A, Z_0\)
    • \(q_0 \xrightarrow{b,A/A} q_1\): stack = \(A, A, Z_0\)
    • \(q_1 \xrightarrow{a,A/\varepsilon} q_1\): stack = \(A, Z_0\)
    • \(q_1 \xrightarrow{a,A/\varepsilon} q_1\): stack = \(Z_0\)
    • \(q_1 \xrightarrow{\varepsilon,Z_0/Z_0} q_2\): ACCEPT \(\checkmark\)

Answer: The PDA with states \(q_0, q_1, q_2\) (accepting) and the transitions above recognizes \(L_2 = \{a^n b\, a^n \mid n \geq 1\}\).